弱监督的点云语义分割方法需要1 \%或更少的标签,希望实现与完全监督的方法几乎相同的性能,这些方法最近引起了广泛的研究关注。该框架中的一个典型解决方案是使用自我训练或伪标记来从点云本身挖掘监督,但忽略了图像中的关键信息。实际上,在激光雷达场景中广泛存在相机,而这种互补信息对于3D应用似乎非常重要。在本文中,我们提出了一种用于3D分割的新型交叉模式弱监督的方法,并结合了来自未标记图像的互补信息。基本上,我们设计了一个配备有效标签策略的双分支网络,以最大程度地发挥标签的力量,并直接实现2D到3D知识转移。之后,我们以期望最大(EM)的视角建立了一个跨模式的自我训练框架,该框架在伪标签估计和更新参数之间进行了迭代。在M-Step中,我们提出了一个跨模式关联学习,通过增强3D点和2D超级像素之间的周期矛盾性,从图像中挖掘互补的监督。在E-Step中,伪标签的自我校准机制被得出过滤噪声标签,从而为网络提供了更准确的标签,以进行全面训练。广泛的实验结果表明,我们的方法甚至优于最先进的竞争对手,而少于1 \%的主动选择注释。
translated by 谷歌翻译
Fine-grained semantic segmentation of a person's face and head, including facial parts and head components, has progressed a great deal in recent years. However, it remains a challenging task, whereby considering ambiguous occlusions and large pose variations are particularly difficult. To overcome these difficulties, we propose a novel framework termed Mask-FPAN. It uses a de-occlusion module that learns to parse occluded faces in a semi-supervised way. In particular, face landmark localization, face occlusionstimations, and detected head poses are taken into account. A 3D morphable face model combined with the UV GAN improves the robustness of 2D face parsing. In addition, we introduce two new datasets named FaceOccMask-HQ and CelebAMaskOcc-HQ for face paring work. The proposed Mask-FPAN framework addresses the face parsing problem in the wild and shows significant performance improvements with MIOU from 0.7353 to 0.9013 compared to the state-of-the-art on challenging face datasets.
translated by 谷歌翻译
Deep unfolding networks (DUNs) have proven to be a viable approach to compressive sensing (CS). In this work, we propose a DUN called low-rank CS network (LR-CSNet) for natural image CS. Real-world image patches are often well-represented by low-rank approximations. LR-CSNet exploits this property by adding a low-rank prior to the CS optimization task. We derive a corresponding iterative optimization procedure using variable splitting, which is then translated to a new DUN architecture. The architecture uses low-rank generation modules (LRGMs), which learn low-rank matrix factorizations, as well as gradient descent and proximal mappings (GDPMs), which are proposed to extract high-frequency features to refine image details. In addition, the deep features generated at each reconstruction stage in the DUN are transferred between stages to boost the performance. Our extensive experiments on three widely considered datasets demonstrate the promising performance of LR-CSNet compared to state-of-the-art methods in natural image CS.
translated by 谷歌翻译
最近,后门攻击已成为对深神经网络(DNN)模型安全性的新兴威胁。迄今为止,大多数现有研究都集中于对未压缩模型的后门攻击。尽管在实际应用中广泛使用的压缩DNN的脆弱性尚未得到利用。在本文中,我们建议研究和发展针对紧凑型DNN模型(RIBAC)的强大和不可感知的后门攻击。通过对重要设计旋钮进行系统分析和探索,我们提出了一个框架,该框架可以有效地学习适当的触发模式,模型参数和修剪口罩。从而同时达到高触发隐形性,高攻击成功率和高模型效率。跨不同数据集的广泛评估,包括针对最先进的防御机制的测试,证明了RIBAC的高鲁棒性,隐身性和模型效率。代码可从https://github.com/huyvnphan/eccv2022-ribac获得
translated by 谷歌翻译
红外小目标检测是在地球观测,军事侦察,救灾等许多领域的重要问题,最近受到了广泛的关注。本文介绍了注意引导金字塔上下文网络(AGPCNET)算法。其主要组件是注意引导的上下文块(AGCB),上下文金字塔模块(CPM)和非对称融合模块(AFM)。AGCB将特征映射分为修补程序以计算本地关联,并使用全局上下文注意(GCA)来计算语义之间的全局关联,CPM集成来自多尺度AGCB的功能,AFM从功能集成了低级和深级语义集成 - 融合视角,增强了特征的利用。实验结果表明,AGPCNET在两个可用的红外小目标数据集上实现了新的最先进的性能。源代码可在https://github.com/tianfang-zhang/agpcnet上获得。
translated by 谷歌翻译
基于混合的点云增强是一种流行的大规模公共数据集可用性问题的问题。但混合点和相应的语义标签之间的不匹配会阻碍诸如部分分割的方向任务中的进一步应用。本文提出了一种点云增强方法,Pointmanifoldcut(PMC),它取代了神经网络嵌入点,而不是欧几里德空间坐标。这种方法利用了在较高级别的神经网络的点已经培训,以培训以嵌入其邻居关系并混合这些表示不会混合自身与其标签之间的关系。我们在PointManifoldCut操作后设置了空间变换模块,以对齐嵌入式空间中的新实例。本文还讨论了不同隐藏层的效果和更换点的方法。实验表明,我们的建议方法可以增强点云分类以及分段网络的性能,并为攻击和几何变换带来了额外的鲁棒性。本文的代码可用于:https://github.com/fun0515/pinityManifoldcut。
translated by 谷歌翻译
我们提出了一种新的非参数混合物模型,用于多变量回归问题,灵感来自概率K-Nearthimest邻居算法。使用有条件指定的模型,对样本外输入的预测基于与每个观察到的数据点的相似性,从而产生高斯混合物表示的预测分布。在混合物组件的参数以及距离度量标准的参数上,使用平均场变化贝叶斯算法进行后推断,并具有基于随机梯度的优化过程。在与数据大小相比,输入 - 输出关系很复杂,预测分布可能偏向或多模式的情况下,输入相对较高的尺寸,该方法尤其有利。对五个数据集进行的计算研究,其中两个是合成生成的,这说明了我们的高维输入的专家混合物方法的明显优势,在验证指标和视觉检查方面都优于竞争者模型。
translated by 谷歌翻译
In this paper, we propose a robust 3D detector, named Cross Modal Transformer (CMT), for end-to-end 3D multi-modal detection. Without explicit view transformation, CMT takes the image and point clouds tokens as inputs and directly outputs accurate 3D bounding boxes. The spatial alignment of multi-modal tokens is performed implicitly, by encoding the 3D points into multi-modal features. The core design of CMT is quite simple while its performance is impressive. CMT obtains 73.0% NDS on nuScenes benchmark. Moreover, CMT has a strong robustness even if the LiDAR is missing. Code will be released at https://github.com/junjie18/CMT.
translated by 谷歌翻译
Given the increasingly intricate forms of partial differential equations (PDEs) in physics and related fields, computationally solving PDEs without analytic solutions inevitably suffers from the trade-off between accuracy and efficiency. Recent advances in neural operators, a kind of mesh-independent neural-network-based PDE solvers, have suggested the dawn of overcoming this challenge. In this emerging direction, Koopman neural operator (KNO) is a representative demonstration and outperforms other state-of-the-art alternatives in terms of accuracy and efficiency. Here we present KoopmanLab, a self-contained and user-friendly PyTorch module of the Koopman neural operator family for solving partial differential equations. Beyond the original version of KNO, we develop multiple new variants of KNO based on different neural network architectures to improve the general applicability of our module. These variants are validated by mesh-independent and long-term prediction experiments implemented on representative PDEs (e.g., the Navier-Stokes equation and the Bateman-Burgers equation) and ERA5 (i.e., one of the largest high-resolution data sets of global-scale climate fields). These demonstrations suggest the potential of KoopmanLab to be considered in diverse applications of partial differential equations.
translated by 谷歌翻译
Rankings are widely collected in various real-life scenarios, leading to the leakage of personal information such as users' preferences on videos or news. To protect rankings, existing works mainly develop privacy protection on a single ranking within a set of ranking or pairwise comparisons of a ranking under the $\epsilon$-differential privacy. This paper proposes a novel notion called $\epsilon$-ranking differential privacy for protecting ranks. We establish the connection between the Mallows model (Mallows, 1957) and the proposed $\epsilon$-ranking differential privacy. This allows us to develop a multistage ranking algorithm to generate synthetic rankings while satisfying the developed $\epsilon$-ranking differential privacy. Theoretical results regarding the utility of synthetic rankings in the downstream tasks, including the inference attack and the personalized ranking tasks, are established. For the inference attack, we quantify how $\epsilon$ affects the estimation of the true ranking based on synthetic rankings. For the personalized ranking task, we consider varying privacy preferences among users and quantify how their privacy preferences affect the consistency in estimating the optimal ranking function. Extensive numerical experiments are carried out to verify the theoretical results and demonstrate the effectiveness of the proposed synthetic ranking algorithm.
translated by 谷歌翻译